ELITE Portal Tutorial: Joining Metadata
Install Synapser if you have not already
install.packages("synapser", repos = c("http://ran.synapse.org"))
install.packages(c("tidyverse"))
Load libraries
library(synapser)
library(readr)
library(dplyr)
library(magrittr)
Log in to Synapse
synLogin()
Download and read in metadata files
To understand the data you are looking at on the ELITE Portal, you will usually need three metadata files, and often need to join them: Individual, Biospecimen, and Assay.
This example uses the three metadata files found on the Study Details page for the Mouse M005 Metabolomics Study.
The ELITE Portal generates the query below when you choose to download programmatically.
# Download the results of the filtered table query
query <- synTableQuery("SELECT * FROM syn52234677 WHERE ( ( \"Study\" = 'Mouse_M005_Study_Metabolomics' ) ) AND ( `resourceType` = 'metadata' )")
Downloaded syn52234677 to /Users/mklein/.synapseCache/876/154076876/SYNAPSE_TABLE_QUERY_154076876.csv
read.table(query$filepath, sep = ",")
| V1 <chr> | V2 <chr> | V3 <chr> | V4 <chr> |
|---|---|---|---|
| dataRestriction | id | name | Study |
| Open | syn61348404 | syn61348404 | Mouse_M005_Study_metabolomics |
| Open | syn64020472 | syn64020472 | Mouse_M005_Study_metabolomics |
| Open | syn64020473 | syn64020473 | Mouse_M005_Study_metabolomics |
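The call above uses read.table() without header = TRUE, so the real column names land in the first data row and the columns are labelled V1 to V4. A minimal sketch of reading the same kind of file with readr::read_csv(), which treats the first line as the header by default, might look like this. The temporary file and its contents are stand-ins copied from the output above, not a live query result, and the names example_path and query_results are only illustrative.

```r
library(readr)

# Stand-in for query$filepath: a temporary CSV with the same columns as the
# query result shown above (values copied from that output).
example_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "dataRestriction,id,name,Study",
    "Open,syn61348404,syn61348404,Mouse_M005_Study_metabolomics",
    "Open,syn64020472,syn64020472,Mouse_M005_Study_metabolomics",
    "Open,syn64020473,syn64020473,Mouse_M005_Study_metabolomics"
  ),
  example_path
)

# read_csv() uses the first line of the file as column names by default
query_results <- read_csv(example_path, show_col_types = FALSE)

# The Synapse IDs of the metadata files are now in a named column
query_results$id
```

With a real query, pass query$filepath in place of example_path.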
# View the file path of the resulting csv
query$filepath
[1] "/Users/mklein/.synapseCache/876/154076876/SYNAPSE_TABLE_QUERY_154076876.csv"
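The query result lists the Synapse IDs of the metadata files rather than the files themselves. One way to fetch them is sketched below, assuming synapser is installed, synLogin() has already succeeded, and you want the files in a local files/ directory. The helper name download_metadata_files and the dest argument are hypothetical, and the downloaded file names may differ from the ones used elsewhere in this tutorial.

```r
# Hypothetical helper: download each Synapse file ID into a local directory.
# Assumes synapser is installed and synLogin() has already been called.
download_metadata_files <- function(ids, dest = "files") {
  dir.create(dest, showWarnings = FALSE, recursive = TRUE)
  # synGet() downloads the file; $path holds the local path it was saved to
  vapply(
    ids,
    function(id) synapser::synGet(id, downloadLocation = dest)$path,
    character(1)
  )
}

# Example call (not run here): IDs taken from the query output above
# download_metadata_files(c("syn61348404", "syn64020472", "syn64020473"))
```

The function returns the local paths, which can then be passed to read_csv().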
Now that the files have been downloaded, we can read them into R. The code below reads them from a local files/ directory.
# Individual metadata
individual_metadata <- read_csv("files/individual_non_human_M005_Longevity Consortium_11-11-2024_final.csv", show_col_types = FALSE)
# Biospecimen metadata
biospecimen_metadata <- read_csv("files/biospecimen_non_human_M005_Longevity Consortium_11-11-2024_final.csv", show_col_types = FALSE)
# Assay metadata
assay_metadata <- read_csv("files/synapse_storage_manifest_assaymetabolomicstemplate.csv", show_col_types = FALSE)
Join Metadata
Now, we join the metadata files together using left joins, matching on specimenID, then on individualID.
# start from the assay metadata; each left join keeps every assay row
joined_meta <- assay_metadata |>
  # add biospecimen columns for rows with a matching specimenID
  left_join(biospecimen_metadata, by = "specimenID") |>
  # add individual columns for rows with a matching individualID
  left_join(individual_metadata, by = "individualID")
joined_meta
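To see what the chained left joins do when a key has no match, here is a small self-contained sketch on toy tables. All values and the names assay, biospecimen, individual, joined, and unmatched are made up for illustration and are not taken from the M005 files.

```r
library(dplyr)

# Toy stand-ins for the three metadata tables (all values are made up)
assay <- tibble(specimenID = c("S1", "S2", "S3"), platform = "LC-MS")
biospecimen <- tibble(
  specimenID = c("S1", "S2"),
  individualID = c("M1", "M2"),
  tissue = c("liver", "plasma")
)
individual <- tibble(individualID = c("M1", "M2"), sex = c("female", "male"))

# Same join pattern as above: specimenID first, then individualID
joined <- assay |>
  left_join(biospecimen, by = "specimenID") |>
  left_join(individual, by = "individualID")

# A left join keeps every assay row; S3 has no biospecimen match, so its
# biospecimen and individual columns are NA
joined

# Assay rows whose specimenID is missing from the biospecimen table
unmatched <- anti_join(assay, biospecimen, by = "specimenID")
unmatched
```

Running the same anti_join() check on assay_metadata and biospecimen_metadata is a quick way to spot specimens that will come through the join with missing biospecimen and individual fields.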